Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix #204] Eliminate delay between binding and log checking #441

Closed
wants to merge 2 commits into from

Conversation

andrewor14
Copy link
Contributor

Bug: In the existing history server, there is a spark.history.updateInterval seconds delay before application logs show up on the UI.

Cause: This is because the following events happen in this order: (1) The background thread that checks for logs starts, but realizes the server has not yet bound and so waits for N seconds, (2) server binds, (3) N seconds later the background thread finds that the server has finally bound to a port, and so finally checks for application logs.

Fix: This PR forces the log checking thread to start immediately after binding. It also documents two relevant environment variables that are currently missing.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14225/

asfgit pushed a commit that referenced this pull request Apr 22, 2014
**Bug**: In the existing history server, there is a `spark.history.updateInterval` seconds delay before application logs show up on the UI.

**Cause**: This is because the following events happen in this order: (1) The background thread that checks for logs starts, but realizes the server has not yet bound and so waits for N seconds, (2) server binds, (3) N seconds later the background thread finds that the server has finally bound to a port, and so finally checks for application logs.

**Fix**: This PR forces the log checking thread to start immediately after binding. It also documents two relevant environment variables that are currently missing.

Author: Andrew Or <andrewor14@gmail.com>

Closes #441 from andrewor14/history-server-fix and squashes the following commits:

b2eb46e [Andrew Or] Document SPARK_PUBLIC_DNS and SPARK_HISTORY_OPTS for the history server
e8d1fbc [Andrew Or] Eliminate delay between binding and checking for logs
(cherry picked from commit 745e496)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
@asfgit asfgit closed this in 745e496 Apr 22, 2014
@andrewor14 andrewor14 deleted the history-server-fix branch April 23, 2014 00:48
pwendell added a commit to pwendell/spark that referenced this pull request May 12, 2014
GraphX shouldn't list Spark as provided.

I noticed this when building an application against GraphX to audit the released artifacts.
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
**Bug**: In the existing history server, there is a `spark.history.updateInterval` seconds delay before application logs show up on the UI.

**Cause**: This is because the following events happen in this order: (1) The background thread that checks for logs starts, but realizes the server has not yet bound and so waits for N seconds, (2) server binds, (3) N seconds later the background thread finds that the server has finally bound to a port, and so finally checks for application logs.

**Fix**: This PR forces the log checking thread to start immediately after binding. It also documents two relevant environment variables that are currently missing.

Author: Andrew Or <andrewor14@gmail.com>

Closes apache#441 from andrewor14/history-server-fix and squashes the following commits:

b2eb46e [Andrew Or] Document SPARK_PUBLIC_DNS and SPARK_HISTORY_OPTS for the history server
e8d1fbc [Andrew Or] Eliminate delay between binding and checking for logs
andrewor14 pushed a commit to andrewor14/spark that referenced this pull request Jan 8, 2015
GraphX shouldn't list Spark as provided.

I noticed this when building an application against GraphX to audit the released artifacts.
(cherry picked from commit 5fecd25)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Fix error of running trovestack in openlab job
arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants